On the usage of the probability integral transform to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems
We present a new distributed fuzzy partitioning method to reduce the
complexity of multi-way fuzzy decision trees in Big Data classification
problems. The proposed algorithm builds a fixed number of fuzzy sets for all
variables and adjusts their shape and position to the real distribution of
training data. A two-step process is applied: 1) transformation of the
original distribution into a standard uniform distribution by means of the
probability integral transform. Since the original distribution is generally
unknown, the cumulative distribution function is approximated by computing the
q-quantiles of the training set; 2) construction of a Ruspini strong fuzzy
partition in the transformed attribute space using a fixed number of equally
distributed triangular membership functions. Despite the aforementioned
transformation, the definition of every fuzzy set in the original space can be
recovered by applying the inverse cumulative distribution function (also known
as quantile function). The experimental results reveal that the proposed
methodology allows the state-of-the-art multi-way fuzzy decision tree (FMDT)
induction algorithm to maintain classification accuracy with up to 6 million
fewer leaves.
Comment: Appeared in 2018 IEEE International Congress on Big Data (BigData
Congress). arXiv admin note: text overlap with arXiv:1902.0935
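The two-step process described in this abstract can be illustrated with a minimal sketch. It is not the authors' distributed implementation: the function names, the piecewise-linear interpolation between quantiles, and the single-machine, pure-Python setting are all assumptions made for illustration.

```python
from bisect import bisect_right

def quantile_cut_points(values, q):
    """Return q+1 cut points: the minimum, the q-1 inner q-quantiles, the maximum."""
    s = sorted(values)
    n = len(s)
    points = []
    for k in range(q + 1):
        pos = k * (n - 1) / q              # fractional index into the sorted sample
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        points.append(s[lo] + (pos - lo) * (s[hi] - s[lo]))
    return points

def approx_cdf(x, cuts):
    """Step 1: approximate the CDF by linear interpolation between cut points."""
    q = len(cuts) - 1
    if x <= cuts[0]:
        return 0.0
    if x >= cuts[-1]:
        return 1.0
    i = bisect_right(cuts, x) - 1          # cuts[i] <= x < cuts[i + 1]
    return (i + (x - cuts[i]) / (cuts[i + 1] - cuts[i])) / q

def triangular_memberships(u, n_sets):
    """Step 2: degrees of u in [0, 1] for n_sets evenly spaced triangular sets."""
    step = 1.0 / (n_sets - 1)
    return [max(0.0, 1.0 - abs(u - j * step) / step) for j in range(n_sets)]

def approx_quantile(u, cuts):
    """Inverse mapping: recover a point of the original space from u in [0, 1]."""
    q = len(cuts) - 1
    pos = min(max(u, 0.0), 1.0) * q
    i = min(int(pos), q - 1)
    return cuts[i] + (pos - i) * (cuts[i + 1] - cuts[i])

# Illustrative, made-up sample with a heavy right tail.
sample = [1, 2, 3, 4, 100]
cuts = quantile_cut_points(sample, 4)
u = approx_cdf(3, cuts)                    # position of x = 3 in the uniform space
degrees = triangular_memberships(u, 3)     # memberships in 3 fuzzy sets
```

Because neighbouring triangles overlap so that their degrees sum to one, the partition in the transformed space is a strong (Ruspini) partition. Applying `approx_quantile` to the triangle centres gives the corresponding fuzzy set cores in the original attribute space.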
Learning dissimilarity-based distances for the KNN classification algorithm
The aim of this project is to improve the KNN (k-nearest neighbours) algorithm by replacing the classical Euclidean distance with parametrized dissimilarities that are tuned by a genetic algorithm. The idea is for the genetic algorithm to learn a set of parameters that are then used to compute the distances between instances, instead of relying on classical distances such as the Euclidean one.
We also consider instance selection and attribute selection, so that the genetic algorithm can exclude noisy instances. This technique also speeds up the distance computation, since fewer instances and attributes mean fewer calculations.
Finally, we compare the resulting variants with the original KNN algorithm to determine whether classification improves.
Degree: Bachelor's Degree in Computer Engineering (Graduado o Graduada en Ingeniería Informática), Universidad Pública de Navarra
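The idea of a KNN classifier driven by a tunable dissimilarity can be sketched as follows. The abstract does not specify the dissimilarity family or the chromosome encoding, so the weighted Minkowski-style form, the binary instance mask `keep`, and all function names here are hypothetical. The genetic algorithm itself is omitted; `fitness` only shows the kind of score it could maximise.

```python
from collections import Counter

def dissimilarity(a, b, weights, p=2.0):
    """Weighted Minkowski-style dissimilarity (assumed form).

    A weight of 0 removes the attribute, which doubles as attribute selection.
    """
    total = sum(w * abs(x - y) ** p for x, y, w in zip(a, b, weights) if w > 0)
    return total ** (1.0 / p)

def knn_predict(query, X, y, weights, keep, k=3, p=2.0):
    """Majority vote among the k nearest training instances that are kept."""
    scored = sorted(
        ((dissimilarity(query, xi, weights, p), yi)
         for xi, yi, use in zip(X, y, keep) if use),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

def fitness(X, y, weights, keep, k=3, p=2.0):
    """Leave-one-out accuracy for one candidate (weights, keep) solution."""
    hits = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        mask = [use and j != i for j, use in enumerate(keep)]
        if any(mask) and knn_predict(xi, X, y, weights, mask, k, p) == yi:
            hits += 1
    return hits / len(X)

# Made-up toy data: attribute 0 separates the classes, attribute 1 is noise.
X = [[0, 5], [1, -5], [10, 5], [11, -5]]
y = ["a", "a", "b", "b"]
keep = [True, True, True, True]

good = fitness(X, y, weights=[1, 0], keep=keep, k=1)   # noise attribute dropped
bad = fitness(X, y, weights=[0, 1], keep=keep, k=1)    # only the noise attribute
```

On this toy data, a candidate that drops the noisy attribute scores higher than one that relies on it, which is the signal a genetic algorithm would exploit. Skipping zero-weight attributes and masked instances is also what reduces the number of distance calculations.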
Genetic biodiversity of marine organisms in Cabrera National Park: applications for conservation
10 pages; 1 figure; 2 tables.
We studied the genetic diversity of populations of several representative species of the marine benthos
(sponges, cnidarians, ascidians, echinoderms and fish) in Cabrera National Park. We used evolutionarily
neutral molecular markers with a high mutation rate (microsatellites) and, in some cases, mitochondrial
genes. Sampling was carried out inside the Park, in several areas of the islands of Mallorca and Ibiza,
and along the peninsular coast. Our results indicate that numerous species of sessile invertebrates and
some fish, which form an essential part of the Park's rocky ecosystems, are genetically isolated from
adjacent areas. This implies that the larval or adult stages of these species do not come mainly from
areas near the Park (not even from the island of Mallorca) but from the Park itself. In other words,
there is a high level of self-recruitment. Consequently, if these populations disappeared from the Park,
their recovery would be slow.
Peer reviewed
Diversity, structure and spatial distribution of megabenthic communities in Cap de Creus continental shelf and submarine canyon (NW Mediterranean)
The continental shelf and submarine canyon off Cap de Creus (NW Mediterranean) were declared a Site of
Community Importance (SCI) within the Natura 2000 Network in 2014. Implementing an effective management
plan to preserve its biological diversity and monitor its evolution through time requires a detailed characterization of its benthic ecosystem. Based on 60 underwater video transects performed between 2007 and 2013
(before the declaration of the SCI), we thoroughly describe the composition and structure of the main megabenthic communities dwelling from the shelf down to 400 m depth inside the submarine canyon. We then
mapped the spatial distribution of the benthic communities using the Random Forest algorithm, which incorporated geomorphological and oceanographic layers as predictors, as well as the intensity of the bottom-trawling
fishing fleet. Although the study area has historically been exposed to commercial fishing practices, it still holds a
rich benthic ecosystem with over 165 different invertebrate (morpho)species of the megafauna identified in the
video footage, which form up to 9 distinct megabenthic communities. The continental shelf is home to coral
gardens of the sea fan Eunicella cavolini, sea pen and soft coral assemblages, dense beds of the crinoid Leptometra
phalangium, diverse sponge grounds and massive aggregations of the brittle star Ophiothrix fragilis. The submarine
canyon off Cap de Creus is characterized by a cold-water coral community dominated by the scleractinian coral
Madrepora oculata, found in association with several invertebrate species including oysters, brachiopods and a
variety of sponge species, as well as by a community dominated by cerianthids and sea urchins, mostly in
sedimentary areas. The benthic communities identified in the area were then compared with habitats/biocenoses
described in reference habitat classification systems that consider circalittoral and bathyal environments of the
Mediterranean. The complex environmental setting characteristic of the marine area off Cap de Creus likely
produces the optimal conditions for communities dominated by suspension- and filter-feeding species to develop.
The uniqueness of this ecosystem and the anthropogenic pressures that it faces should prompt the development of
effective management actions to ensure the long-term conservation of the benthic fauna representative of this
marine area.